Neural Captioning for the ImageCLEF 2017 Medical Image Challenges
Authors
Abstract
Manual image annotation is a major bottleneck in the processing of medical images, and the accuracy of the resulting reports varies with the clinician's expertise. Automating some or all of this process would have an enormous impact in terms of efficiency, cost and accuracy. Previous approaches to automatically generating captions from images have relied on hand-crafted pipelines of feature extraction, together with techniques such as templating and nearest-neighbour sentence retrieval, to assemble likely sentences. Recent deep learning-based approaches to general image captioning use fully differentiable models to learn how to generate captions directly from images. In this paper, we address the challenge of end-to-end medical image captioning by pairing an image-encoding convolutional neural network (CNN) with a language-generating recurrent neural network (RNN). Our method is an adaptation of the NICv2 model, which has shown state-of-the-art results in general image captioning. Using only data provided in the training dataset, we attained a BLEU score of 0.0982 on the ImageCLEF 2017 Caption Prediction Challenge and an average F1 score of 0.0958 on the Concept Detection Challenge.
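To make the encoder-decoder pattern described above concrete, the following is a minimal sketch in PyTorch; the ResNet-50 backbone, layer sizes and vocabulary size are illustrative placeholders, not the authors' exact NICv2 configuration.

# Minimal CNN-encoder / LSTM-decoder captioning sketch (illustrative only;
# the torchvision backbone and all dimensions are assumptions).
import torch
import torch.nn as nn
import torchvision.models as models

class EncoderCNN(nn.Module):
    """Encode an image into a fixed-length feature vector."""
    def __init__(self, embed_size):
        super().__init__()
        backbone = models.resnet50(weights=None)   # any ImageNet-style CNN works
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # drop classifier head
        self.fc = nn.Linear(backbone.fc.in_features, embed_size)

    def forward(self, images):                     # images: (B, 3, H, W)
        feats = self.cnn(images).flatten(1)        # (B, 2048)
        return self.fc(feats)                      # (B, embed_size)

class DecoderRNN(nn.Module):
    """Generate a caption one token at a time, conditioned on the image."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, image_feats, captions):
        # Teacher forcing: the image embedding acts as the first input step.
        tokens = self.embed(captions[:, :-1])
        inputs = torch.cat([image_feats.unsqueeze(1), tokens], dim=1)
        hidden, _ = self.lstm(inputs)
        return self.out(hidden)                    # (B, T, vocab_size)

# Training-step sketch: cross-entropy between predicted and reference tokens.
encoder, decoder = EncoderCNN(256), DecoderRNN(256, 512, vocab_size=10000)
images = torch.randn(4, 3, 224, 224)
captions = torch.randint(0, 10000, (4, 20))        # token-id references
logits = decoder(encoder(images), captions)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 10000), captions.reshape(-1))

At inference time the decoder would be unrolled from the image embedding, greedily or with beam search, which is how NIC-style models produce captions token by token.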
Similar Resources
Convolutional Image Captioning
Image captioning is an important but challenging task, applicable to virtual assistants, editing tools, image indexing, and support of the disabled. Its challenges are due to the variability and ambiguity of possible image descriptions. In recent years significant progress has been made in image captioning, using Recurrent Neural Networks powered by long short-term memory (LSTM) units. Despite ...
Show-and-Fool: Crafting Adversarial Examples for Neural Image Captioning
Modern neural image captioning systems typically adopt the encoder-decoder framework consisting of two principal components: a convolutional neural network (CNN) for image feature extraction and a recurrent neural network (RNN) for caption generation. Inspired by the robustness analysis of CNN-based image classifiers to adversarial perturbations, we propose Show-and-Fool, a novel algorithm for ...
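As a rough illustration of the attack idea described above, the snippet below applies a single FGSM-style gradient-sign step that increases the captioning loss on the image's own reference caption; this is a simplified stand-in, not the actual Show-and-Fool optimisation, and the encoder/decoder interfaces are assumed to match the sketch given earlier.

# FGSM-style sketch: perturb an image so the captioner drifts away from its
# original caption. A single gradient-sign step is a simplification of the
# more elaborate optimisation used by Show-and-Fool.
import torch
import torch.nn as nn

def caption_adversary(encoder, decoder, image, caption_ids, epsilon=0.01):
    """Return an image nudged to increase the loss on its own caption."""
    image = image.clone().requires_grad_(True)
    logits = decoder(encoder(image), caption_ids)              # (B, T, V)
    loss = nn.CrossEntropyLoss()(
        logits.reshape(-1, logits.size(-1)), caption_ids.reshape(-1)
    )
    loss.backward()
    # Ascend the caption loss: the perturbed image should no longer be
    # described the same way, while staying visually close to the original.
    return (image + epsilon * image.grad.sign()).detach().clamp(0.0, 1.0)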
Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention
Image captioning has been recently gaining a lot of attention thanks to the impressive achievements shown by deep captioning architectures, which combine Convolutional Neural Networks to extract image representations, and Recurrent Neural Networks to generate the corresponding captions. At the same time, a significant research effort has been dedicated to the development of saliency prediction ...
Automated Image Captioning Using Nearest-Neighbors Approach Driven by Top-Object Detections
The significant performance gains in deep learning coupled with the exponential growth of image and video data on the Internet have resulted in the recent emergence of automated image captioning systems. Two broad paradigms have emerged in automated image captioning, i.e., generative model-based approaches and retrieval-based approaches. Although generative model-based approaches that use the r...
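A retrieval-based captioner in its simplest form can be sketched as follows: embed the query image with a CNN, find the nearest training image in feature space, and reuse its caption. The feature dimensionality and cosine similarity below are assumptions for illustration, not the cited paper's exact pipeline.

# Retrieval-based captioning sketch: reuse the caption of the most similar
# training image (cosine similarity over CNN features, both illustrative).
import numpy as np

def retrieve_caption(query_feat, train_feats, train_captions):
    """Return the caption of the training image closest to the query.

    query_feat:     (D,)   CNN feature vector of the query image
    train_feats:    (N, D) CNN feature vectors of the training images
    train_captions: list of N reference caption strings
    """
    q = query_feat / np.linalg.norm(query_feat)
    t = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = t @ q                                   # cosine similarity to each training image
    return train_captions[int(np.argmax(sims))]

# Usage with toy features:
feats = np.random.rand(100, 2048)
captions = [f"caption {i}" for i in range(100)]
print(retrieve_caption(np.random.rand(2048), feats, captions))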
Joint Learning of CNN and LSTM for Image Captioning
In this paper, we describe the details of our methods for the participation in the subtask of the ImageCLEF 2016 Scalable Image Annotation task: Natural Language Caption Generation. The model we used is the combination of an encoding procedure and a decoding procedure, which includes a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) based Recurrent Neural Network. We f...